Discriminating and Identifying Codes 
in the Binary Hamming Space 



Irene Charon 

GET - Telecom Paris & CNRS - LTCI UMR 5141 
46, rue Barrault, 75634 Paris Cedex 13 - France 
irene.charon@enst.fr

Gerard Cohen 

GET - Telecom Paris & CNRS - LTCI UMR 5141 
46, rue Barrault, 75634 Paris Cedex 13 - France 
gerard.cohen@enst.fr 

Olivier Hudry 

GET - Telecom Paris & CNRS - LTCI UMR 5141 
46, rue Barrault, 75634 Paris Cedex 13 - France 
olivier.hudry@enst.fr

Antoine Lobstein 

CNRS - LTCI UMR 5141 & GET - Telecom Paris 
46, rue Barrault, 75634 Paris Cedex 13 - France 
antoine.lobstein@enst.fr



Abstract 

Let $F^n$ be the binary $n$-cube, or binary Hamming space of dimension
$n$, endowed with the Hamming distance, and $\mathcal{E}^n$ (respectively,
$\mathcal{O}^n$) the set of vectors with even (respectively, odd) weight.
For $r \ge 1$ and $x \in F^n$, we denote by $B_r(x)$ the ball of radius $r$
and centre $x$. A code $C \subseteq F^n$ is said to be $r$-identifying if
the sets $B_r(x) \cap C$, $x \in F^n$, are all nonempty and distinct. A code
$C \subseteq \mathcal{E}^n$ is said to be $r$-discriminating if the sets
$B_r(x) \cap C$, $x \in \mathcal{O}^n$, are all nonempty and distinct. We
show that the two definitions, which were given for general graphs, are
equivalent in the case of the Hamming space, in the following sense: for
any odd $r$, there is a bijection between the set of $r$-identifying codes
in $F^n$ and the set of $r$-discriminating codes in $F^{n+1}$. We then
extend previous studies on constructive upper bounds for the minimum
cardinalities of identifying codes in the Hamming space.

1 Introduction

We define identifying and discriminating codes in a connected, undirected 
graph G = (V,E), in which a code is simply a nonempty subset of vertices. 
These definitions can help, in various meanings, to unambiguously determine 
a vertex. The motivations may come from processor networks where we 
wish to locate a faulty vertex under certain conditions, or from the need to 
identify an individual, given its set of attributes. 

In $G$ we define the usual distance $d(v_1, v_2)$ between two vertices
$v_1, v_2 \in V$ as the smallest possible number of edges in any path
between them. For an integer $r \ge 0$ and a vertex $v \in V$, we define
$B_r(v)$ (respectively, $S_r(v)$), the ball (respectively, sphere) of
radius $r$ centred at $v$, as the set of vertices within distance
(respectively, at distance exactly) $r$ from $v$. Whenever two vertices
$v_1$ and $v_2$ are such that $v_1 \in B_r(v_2)$ (or, equivalently,
$v_2 \in B_r(v_1)$), we say that they $r$-cover each other. A set
$X \subseteq V$ $r$-covers a set $Y \subseteq V$ if every vertex in $Y$ is
$r$-covered by at least one vertex in $X$.

The elements of a code $C \subseteq V$ are called codewords. For each vertex
$v \in V$, we denote by

$$K_{C,r}(v) = C \cap B_r(v)$$

the set of codewords $r$-covering $v$. Two vertices $v_1$ and $v_2$ with
$K_{C,r}(v_1) \ne K_{C,r}(v_2)$ are said to be $r$-separated by code $C$,
and any codeword belonging to exactly one of the two sets $B_r(v_1)$ and
$B_r(v_2)$ is said to $r$-separate $v_1$ and $v_2$.

A code $C \subseteq V$ is called $r$-identifying [12] if all the sets
$K_{C,r}(v)$, $v \in V$, are nonempty and distinct. In other words, every
vertex is $r$-covered by at least one codeword, and every pair of vertices
is $r$-separated by at least one
codeword. Such codes are also sometimes called differentiating dominating 
sets 0. 

We now suppose that $G$ is bipartite: $G = (V = I \cup A, E)$, with no
edges inside $I$ nor $A$ (here, $A$ stands for attributes and $I$ for
individuals). A code $C \subseteq A$ is said to be $r$-discriminating [1]
if all the sets $K_{C,r}(i)$, $i \in I$, are nonempty and distinct. From
the definition we see that we can consider only odd values of $r$.

In the following, we drop the general case and turn to the binary Ham- 
ming space of dimension n, also called the binary n-cube, which is a regular 
bipartite graph. First we need to give some specific definitions and notation. 

We consider the $n$-cube as the set of binary row-vectors of length $n$,
and as such, we denote it by $G = (F^n, E)$ with $F = \{0,1\}$ and
$E = \{\{x,y\} : d(x,y) = 1\}$, the usual graph distance $d(x,y)$ between
two vectors $x$ and $y$ being called here the Hamming distance; it is
simply the number of coordinates where $x$ and $y$ differ. The Hamming
weight of a vector $x$ is its distance to the all-zero vector, i.e., the
number of its nonzero coordinates. A vector is said to be even
(respectively, odd) if its weight is even (respectively, odd), and we
denote by $\mathcal{E}^n$ (respectively, $\mathcal{O}^n$) the set of the
$2^{n-1}$ even (respectively, odd) vectors in $F^n$. Without loss of
generality, for the definition of an $r$-discriminating code, we choose the
set $A$ to be $\mathcal{E}^n$, and the set $I$ to be $\mathcal{O}^n$.
Additions are carried out coordinatewise and modulo two.

We denote by $0^n$ (respectively, $1^n$) the all-zero (respectively,
all-one) vector of length $n$. Given a vector $x \in F^n$, we denote by
$\pi(x)$ its parity-check bit: $\pi(x) = 0$ if $x$ is even, $\pi(x) = 1$ if
$x$ is odd. Therefore, if $|$ stands for concatenation of vectors,
$x|\pi(x)$ is an even vector. For two sets $X \subseteq F^{n_1}$,
$Y \subseteq F^{n_2}$, the direct sum of $X$ and $Y$, denoted by
$X \oplus Y$, is defined by
$X \oplus Y = \{x|y \in F^{n_1+n_2} : x \in X, y \in Y\}$. Finally, we
denote by $M_r(n)$ (respectively, $D_r(n)$) the smallest possible
cardinality of an $r$-identifying (respectively, $r$-discriminating) code
in $F^n$.
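The two definitions can be checked by brute force for small $n$. The following is only a minimal sketch under our own conventions, not part of the constructions of this paper: vectors of $F^n$ are encoded as integers in `range(2 ** n)`, and the function names are hypothetical.

```python
def hamming(x, y):
    """Hamming distance between two vectors encoded as integers."""
    return bin(x ^ y).count("1")


def covering_sets(code, points, r):
    """For each point v, the set K_{C,r}(v) of codewords within distance r."""
    return [frozenset(c for c in code if hamming(v, c) <= r) for v in points]


def nonempty_and_distinct(sets):
    """True if no set is empty and no two sets coincide."""
    return all(sets) and len(set(sets)) == len(sets)


def is_identifying(code, n, r):
    """Brute-force test of the r-identifying property in F^n."""
    return nonempty_and_distinct(covering_sets(code, range(2 ** n), r))


def is_discriminating(code, n, r):
    """Brute-force test: even codewords must r-discriminate the odd vectors."""
    weight = lambda v: bin(v).count("1")
    if any(weight(c) % 2 for c in code):
        return False  # a discriminating code only contains even vectors
    odd_vectors = [v for v in range(2 ** n) if weight(v) % 2 == 1]
    return nonempty_and_distinct(covering_sets(code, odd_vectors, r))


# The three vectors 00, 01, 10 form a 1-identifying code in F^2.
print(is_identifying({0b00, 0b01, 0b10}, n=2, r=1))
```

The running time is exponential in $n$, so this is only meant for experimenting with very short codes.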

In Section 2 we show that in the particular case of Hamming space, the
two notions of $r$-identifying and $r$-discriminating codes actually
coincide for all odd values of $r$ and all $n \ge 2$, in the sense that
there is a bijection between the set of $r$-identifying codes in $F^n$ and
the set of $r$-discriminating codes in $F^{n+1}$. In Section 3 we give
various methods for constructing identifying codes, thus obtaining, in
Section 4, upper bounds on $M_r(n)$, of which several are new. These bounds
are summarized in Tables at the end of the paper.

2 Identifying is discriminating 

As we now show with the following two theorems, for any odd $r \ge 1$, any
$r$-identifying code in $F^n$ can be extended into an $r$-discriminating
code in $F^{n+1}$, and any $r$-discriminating code in $F^n$ can be
shortened into an $r$-identifying code in $F^{n-1}$. First, observe that
$r$-identifying codes exist in $F^n$ if and only if $r < n$.

Theorem 1 Let $n \ge 2$, $p \ge 0$ be such that $2p + 1 < n$, let
$C \subseteq F^n$ be a $(2p+1)$-identifying code and let

$$C' = \{c|\pi(c) : c \in C\}.$$

Then $C'$ is $(2p+1)$-discriminating in $F^{n+1}$. Therefore,

$$D_{2p+1}(n+1) \le M_{2p+1}(n). \qquad (1)$$


Proof. Let $r = 2p + 1$. By construction, $C'$ contains only even vectors.
We shall prove that (a) any odd vector $x \in \mathcal{O}^{n+1}$ is
$r$-covered by at least one codeword of $C'$; (b) given any two distinct
odd vectors $x, y \in \mathcal{O}^{n+1}$, there is at least one codeword in
$C'$ which $r$-separates them.

(a) We write $x = x_1|x_2$ with $x_1 \in F^n$ and $x_2 \in F$. Because $C$
is $r$-identifying in $F^n$, there is a codeword $c \in C$ with
$d(x_1, c) \le r$. Let $c' = c|\pi(c)$.

If $d(x_1, c) \le r - 1$, then whatever the values of $x_2$ and $\pi(c)$
are, we have $d(x, c') \le r$; we assume therefore that
$d(x_1, c) = r = 2p + 1$, which implies that $x_1$ and $c$ have different
parities. Since $x_1|x_2$ and $c|\pi(c)$ also have different parities, we
have $x_2 = \pi(c)$ and $d(x, c') = r$. So the codeword $c' \in C'$
$r$-covers $x$.

(b) We write $x = x_1|x_2$, $y = y_1|y_2$, with $x_1, y_1 \in F^n$,
$x_2, y_2 \in F$. Since $C$ is $r$-identifying in $F^n$, there is a
codeword $c \in C$ which is, say, within distance $r$ from $x_1$ and not
from $y_1$: $d(x_1, c) \le r$, $d(y_1, c) > r$. Let $c' = c|\pi(c)$.

For the same reasons as above, $x$ is within distance $r$ from $c'$,
whereas obviously, $d(y, c') \ge d(y_1, c) > r$. So $c' \in C'$
$r$-separates $x$ and $y$.

Inequality (1) follows. □
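The parity extension of Theorem 1 can be tried on a toy case. The sketch below is only an illustration under our own conventions (vectors as integers, the parity bit appended as the lowest-order bit, hypothetical function names); it extends the code formed by the four even vectors of $F^3$, which is 1-identifying, and tests the result in $F^4$.

```python
def weight(v):
    """Hamming weight of a vector encoded as an integer."""
    return bin(v).count("1")


def hamming(x, y):
    return weight(x ^ y)


def nonempty_and_distinct(code, points, r):
    sets = [frozenset(c for c in code if hamming(v, c) <= r) for v in points]
    return all(sets) and len(set(sets)) == len(sets)


def is_identifying(code, n, r):
    return nonempty_and_distinct(code, range(2 ** n), r)


def is_discriminating(code, n, r):
    odd = [v for v in range(2 ** n) if weight(v) % 2 == 1]
    return all(weight(c) % 2 == 0 for c in code) and \
        nonempty_and_distinct(code, odd, r)


def parity_extend(code):
    """C' = {c | pi(c) : c in C}; the parity bit becomes the new last bit."""
    return {(c << 1) | (weight(c) % 2) for c in code}


# The even vectors of F^3, taken here as a small 1-identifying code.
C = {0b000, 0b011, 0b101, 0b110}
C_ext = parity_extend(C)
print(is_identifying(C, 3, 1), is_discriminating(C_ext, 4, 1))
```

Both tests succeed on this example, in agreement with the theorem for $p = 0$ and $n = 3$.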

Theorem 2 Let $n \ge 3$, $p \ge 0$ be such that $2p + 2 < n$, let
$C \subseteq \mathcal{E}^n$ be a $(2p+1)$-discriminating code and let
$C' \subseteq F^{n-1}$ be any code obtained by the deletion of one
coordinate in $C$. Then $C'$ is $(2p+1)$-identifying in $F^{n-1}$.
Therefore,

$$M_{2p+1}(n-1) \le D_{2p+1}(n). \qquad (2)$$

Proof. Let $r = 2p + 1$. Let $C \subseteq \mathcal{E}^n$ be an
$r$-discriminating code and $C' \subseteq F^{n-1}$ be the code obtained by
deleting, say, the last coordinate in $C$. We shall prove that (a) any
vector $x \in F^{n-1}$ is $r$-covered by at least one codeword of $C'$;
(b) given any two distinct vectors $x, y \in F^{n-1}$, there is at least
one codeword in $C'$ which $r$-separates them.

(a) The vector $x|(\pi(x)+1) \in F^n$ is odd. As such, it is $r$-covered by
a codeword $c = c'|u \in C \subseteq \mathcal{E}^n$: $c' \in C'$,
$u = \pi(c')$, and $d(x|(\pi(x)+1), c) \le r$. This proves that $x$ is
within distance $r$ from a codeword of $C'$.

(b) Both $x|(\pi(x)+1)$ and $y|(\pi(y)+1)$ are odd vectors in $F^n$, and
there is a codeword $c = c'|u \in C \subseteq \mathcal{E}^n$, with
$c' \in C'$, $u = \pi(c')$, which $r$-separates them: without loss of
generality, $d(x|(\pi(x)+1), c) \le r$ whereas $d(y|(\pi(y)+1), c)$, which
is an odd integer, is at least $r + 2$. Then obviously, $d(x, c') \le r$
and $d(y, c') \ge r + 1$, i.e., there is a codeword in $C'$ which
$r$-separates $x$ and $y$.

Inequality (2) follows. □

Corollary 3 For all $n \ge 2$ and $p \ge 0$ such that $2p + 1 < n$, we
have:

$$D_{2p+1}(n+1) = M_{2p+1}(n).$$

□
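For very small parameters, the equality of Corollary 3 can be observed by exhaustive search. The following sketch is only an illustration with hypothetical function names (`M` and `D` mirror the notation $M_r(n)$ and $D_r(n)$); it enumerates all candidate codes by increasing size, so it is usable only for $n \le 4$ or so.

```python
from itertools import combinations


def weight(v):
    return bin(v).count("1")


def hamming(x, y):
    return weight(x ^ y)


def min_code_size(candidates, points, r):
    """Smallest size of a subset of candidates whose covering sets
    K_{C,r}(v), v in points, are all nonempty and distinct."""
    for size in range(1, len(candidates) + 1):
        for code in combinations(candidates, size):
            sets = [frozenset(c for c in code if hamming(v, c) <= r)
                    for v in points]
            if all(sets) and len(set(sets)) == len(sets):
                return size
    return None  # no such code exists (e.g. r >= n for identifying codes)


def M(r, n):
    """Minimum size of an r-identifying code in F^n, by exhaustive search."""
    vectors = list(range(2 ** n))
    return min_code_size(vectors, vectors, r)


def D(r, n):
    """Minimum size of an r-discriminating code in F^n, by exhaustive search."""
    even = [v for v in range(2 ** n) if weight(v) % 2 == 0]
    odd = [v for v in range(2 ** n) if weight(v) % 2 == 1]
    return min_code_size(even, odd, r)


for n in (2, 3):
    print(n, M(1, n), D(1, n + 1))
```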

It follows that, in the Hamming space, the complexity of problems on dis- 
criminating codes is the same as that for identifying codes; in particular, it 
is known [11] that deciding whether a given code $C \subseteq F^n$ is $r$-identifying is
co-NP-complete. 

We now turn to constructions of identifying codes in the n-cube, since 
this is equivalent to our initial goal of constructing discriminating codes. 

For previous works, we refer to, e.g., [1]-[3], [7], [10], [11] or [12]. In
the recent [8], tables for exact values or bounds on $M_1(n)$,
$2 \le n \le 19$, and $M_2(n)$, $3 \le n \le 21$, are given.

3 Constructing identifying codes 

We use the notation $(r, n)$ or $(r, n)K$ for a code in $F^n$ which is
$r$-identifying and has $K$ elements. Our constructions will use Theorem 5
below, as well as various heuristics.

3.1 Extending an identifying code 

In the constructions of Theorems 5 and 6 below, we use a new definition: a
code is called $r$-separating if every pair of vertices is $r$-separated by
at least one codeword [2, Sec. 3] (we no longer require that every vertex
be $r$-covered by at least one codeword). The following remark and lemma
are easy.

Remark 1.

(i) For $0 \le r \le n - 1$, a code $C \subseteq F^n$ is $r$-separating if,
and only if, it is also $(n-r-1)$-separating, because
$B_r(x) = F^n \setminus B_{n-r-1}(x + 1^n)$ for all $x \in F^n$.

(ii) Since a separating code is such that at most one vertex can be covered
by zero codeword, the size of an optimum $r$-separating code in $F^n$ is
$M_r(n)$ or $M_r(n) - 1$, and we have:

$$M_{\max\{r,n-r-1\}}(n) - 1 \le M_{\min\{r,n-r-1\}}(n) \le M_{\max\{r,n-r-1\}}(n) + 1, \qquad (3)$$

i.e., the symmetry, with respect to $\lfloor (n-1)/2 \rfloor$, observed for
separating codes, still holds, within one, for identifying codes.

Lemma 4 For all $p \ge 1$ and $\lambda \in \{0, 1, \ldots, p-1\}$, the set
$F^p \setminus \{0^p\}$ is $\lambda$-separating. □
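Lemma 4 is easy to confirm numerically for small $p$. The sketch below is only an illustration with hypothetical helper names; vectors of $F^p$ are encoded as integers, so that $F^p \setminus \{0^p\}$ is `range(1, 2 ** p)`.

```python
def hamming(x, y):
    return bin(x ^ y).count("1")


def is_separating(code, n, r):
    """True if the sets K_{C,r}(v), v in F^n, are pairwise distinct
    (one of them is allowed to be empty)."""
    sets = [frozenset(c for c in code if hamming(v, c) <= r)
            for v in range(2 ** n)]
    return len(set(sets)) == len(sets)


def lemma4_holds(p):
    """Check that F^p minus the all-zero vector is lambda-separating
    for every lambda in {0, ..., p - 1}."""
    nonzero = range(1, 2 ** p)
    return all(is_separating(nonzero, p, lam) for lam in range(p))


print([lemma4_holds(p) for p in range(1, 6)])
```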
The following theorem is inspired by [12, Th. 9] and [U Ex. 2 and Th. 4].
Starting with an $(r, n)$ code $C$, we intend to see how the direct sum
$C \oplus F^p$ can be used for constructing an $(r, n+p)$ code. In
construction C2, $k$ is an additional parameter on which we can act.

More comments on how to understand and use this theorem are given
after its statement.

Theorem 5 Let $r \ge 1$, $p \ge 1$, and $k \in \{0, 1, \ldots, p-1\}$; let
$C$ be an $(r, n)$ code and

$$X_p = \{x \in F^n : \forall c \in C,\ d(x, c) \le r - p \text{ or } d(x, c) > r\}.$$

Construction C1: Let $Y_p \subseteq F^n$ be a (minimum) set such that for
every $x \in X_p$ there exists $y \in Y_p$ with
$r - p + 1 \le d(x, y) \le r$. Then

$$C' = (C \oplus F^p) \cup (Y_p \oplus (F^p \setminus \{0^p\})) \qquad (4)$$

is $(r, n+p)$.

Construction C2: Let $Y_{p,k} \subseteq F^n$ be a (minimum) set such that
for every $x \in X_p$ there exists $y \in Y_{p,k}$ with $d(x, y) = r - k$,
and let $C_{p,k}$ be a (minimum) $k$-separating code in $F^p$. Then

$$C' = (C \oplus F^p) \cup (Y_{p,k} \oplus C_{p,k}) \qquad (5)$$

is $(r, n+p)$.

Proof. See the proof of Theorem 6, which contains Theorem 5 as a
particular case. □

Theorem 5 calls for several remarks, in order to make its dry technicality
more friendly.

Remark 2. Ideally, $X_p = \emptyset$; then $C \oplus F^p$ is $(r, n+p)$.
This is Th. 4 in [8] (Th. 1 in [3] for $r = 1$). This is the case as soon
as $p \ge r + 1$, cf. Cor. 3 in [8] (Th. 2 in [3] for $r = 1$). Therefore
we can limit ourselves to

$$p \le r.$$

On the other hand, we have

$$X_1 \supseteq X_2 \supseteq \cdots \supseteq X_r,$$

so the smaller the number $p$, probably the more difficult it is to jump to
length $n + p$ without having a large set $Y_p$ or $Y_{p,k}$.
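Construction C1 can be reproduced on a toy example. The following sketch is only an illustration under our own assumptions: vectors are integers, the $p$ appended coordinates are the low-order bits, the function names are hypothetical, and the set $Y_p$ is built by a simple greedy covering, so it is a valid but not necessarily minimum choice.

```python
def hamming(x, y):
    return bin(x ^ y).count("1")


def is_identifying(code, n, r):
    sets = [frozenset(c for c in code if hamming(v, c) <= r)
            for v in range(2 ** n)]
    return all(sets) and len(set(sets)) == len(sets)


def X(code, n, r, p):
    """The set X_p of Theorem 5."""
    return [x for x in range(2 ** n)
            if all(hamming(x, c) <= r - p or hamming(x, c) > r for c in code)]


def greedy_Y(Xp, n, r, p):
    """A (not necessarily minimum) set Y_p: every x in X_p has some y in Y_p
    with r - p + 1 <= d(x, y) <= r."""
    ok = lambda v, x: r - p + 1 <= hamming(v, x) <= r
    uncovered, Y = set(Xp), []
    while uncovered:
        y = max(range(2 ** n), key=lambda v: sum(ok(v, x) for x in uncovered))
        Y.append(y)
        uncovered = {x for x in uncovered if not ok(y, x)}
    return Y


def construction_C1(code, n, r, p):
    """C' = (C + F^p) union (Y_p + (F^p minus 0^p)), '+' being the direct sum."""
    Y = greedy_Y(X(code, n, r, p), n, r, p)
    direct_sum = {(c << p) | t for c in code for t in range(2 ** p)}
    extra = {(y << p) | t for y in Y for t in range(1, 2 ** p)}
    return direct_sum | extra


C = [0b000, 0b011, 0b101, 0b110]      # a (1, 3)4 code: the even vectors of F^3
print(X(C, 3, 1, 1), X(C, 3, 1, 2))   # X_1 is nonempty, X_2 is empty
print(is_identifying(construction_C1(C, 3, 1, 1), 4, 1))
```

On this example $X_1$ consists of the four codewords themselves, while $X_2$ is empty since $p = 2 \ge r + 1$, so that the direct sum alone is already identifying in $F^5$.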

Remark 3. In construction C1, we build a minimum set $Y_p$ using the union
of $p$ spheres of radii ranging from $r - p + 1$ to $r$, whereas in
construction C2, for $Y_{p,k}$ we use only one sphere of radius $r - k$. We
can therefore hope for a set $Y_p$ (much) smaller than each set $Y_{p,k}$.
The price to pay is that $|Y_p|$ has to be multiplied by $2^p - 1$, whereas
$|Y_{p,k}|$ has a (much) smaller factor.

When $k = 0$ or $k = p - 1$, the smallest $k$-separating codes in $F^p$
have size $2^p - 1$, and construction C2 is not better than construction
C1; therefore, for construction C2 we can limit ourselves to the cases

$$1 \le k \le p - 2, \qquad 3 \le p \le r.$$

For different values of $p$ and $k$, it seems very difficult to compare
constructions C1 and C2, or constructions C2 between themselves. For a
fixed $p$, $k$ varies from 1 to $p - 2$. When $k$ increases, up to
$\lfloor (p-1)/2 \rfloor$, it may be that $|Y_{p,k}|$ increases and
$|C_{p,k}|$ decreases (and, by Remark 1(i) before Theorem 5, in this case
$|C_{p,k}|$ would increase when $k$ ranges from
$\lfloor (p-1)/2 \rfloor + 1$ to $p - 2$); but actually the former
hypothesis highly depends on particular situations (see Example 1 below),
and the latter, more general, remains to be proved.

Example 1. In $F^{10}$, consider the five vectors $x_1 = 1^2|0^8$,
$x_2 = 0^2|1^2|0^6$, $x_3 = 0^4|1^2|0^4$, $x_4 = 0^6|1^2|0^2$,
$x_5 = 0^8|1^2$. Then $0^{10}$ is at distance two from each of them, but it
is easy to see that it is impossible to find a vector which is at distance
one from each of them or a vector which is at distance three from each of
them. So, if $X_p = \{x_1, x_2, x_3, x_4, x_5\}$, then we have
$|Y_{p,r}| = 5$, $|Y_{p,r-1}| > 1$, $|Y_{p,r-2}| = 1$ and
$|Y_{p,r-3}| > 1$.
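The distance claims of Example 1 can be confirmed by scanning the 1024 vectors of $F^{10}$. This is only a small sketch, with vectors encoded as integers (leftmost coordinate as highest-order bit) and a hypothetical helper name.

```python
def hamming(x, y):
    return bin(x ^ y).count("1")


# x_1, ..., x_5: two consecutive ones, shifted by two positions each time.
xs = [0b11 << shift for shift in (8, 6, 4, 2, 0)]


def at_distance_from_all(d):
    """All vectors of F^10 at distance exactly d from each of x_1, ..., x_5."""
    return [v for v in range(2 ** 10) if all(hamming(v, x) == d for x in xs)]


for d in (1, 2, 3):
    print(d, at_distance_from_all(d))
```

Only $d = 2$ yields a vector, namely the all-zero one.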

This could indicate that, in the absence of information on $|Y_{p,k}|$, a
reasonable bet is to take $k = \lfloor (p-1)/2 \rfloor$, assuming that
$|C_{p,k}|$ is minimum for this $k$. Let us give two small examples.

Example 2. We use the notation of Theorem 5.

- Case $p = 3$; $r \ge 3$, $k = 1$.

$Y_3$ is such that $d(x, y) = r - 2$, $r - 1$ or $r$, and $|Y_3|$ is
multiplied by 7.
$Y_{3,1}$ is such that $d(x, y) = r - 1$, and $|Y_{3,1}|$ is multiplied by
$M_1(3) - 1 = 3$: $C_{3,1} = \{000, 001, 100\}$ is 1-separating in $F^3$
(but not 1-identifying: 111 is not 1-covered by $C_{3,1}$).

- Case $p = 5$; $r \ge 5$, $k \in \{1, 2, 3\}$.

$Y_5$: $d(x, y) \in \{r-4, r-3, r-2, r-1, r\}$, and $|Y_5|$ multiplied by 31.
$Y_{5,1}$: $d(x, y) = r - 1$, $|Y_{5,1}|$ multiplied by $M_1(5) = 10$ or by $M_1(5) - 1 = 9$.
$Y_{5,2}$: $d(x, y) = r - 2$, $|Y_{5,2}|$ multiplied by $M_2(5) = 6$ or by $M_2(5) - 1 = 5$.
$Y_{5,3}$: $d(x, y) = r - 3$, $|Y_{5,3}|$ multiplied by $M_1(5) = 10$ or by $M_1(5) - 1 = 9$.
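The statements about $C_{3,1}$ in the case $p = 3$ can be verified directly. The sketch below is only an illustration with hypothetical helper names; it also searches exhaustively for the smallest 1-separating code in $F^3$.

```python
from itertools import combinations


def hamming(x, y):
    return bin(x ^ y).count("1")


def covering_sets(code, n, r):
    return [frozenset(c for c in code if hamming(v, c) <= r)
            for v in range(2 ** n)]


def is_separating(code, n, r):
    sets = covering_sets(code, n, r)
    return len(set(sets)) == len(sets)


def uncovered(code, n, r):
    """Vectors of F^n that are r-covered by no codeword."""
    return [v for v, s in enumerate(covering_sets(code, n, r)) if not s]


def min_separating_size(n, r):
    """Smallest size of an r-separating code in F^n (exhaustive search)."""
    for size in range(1, 2 ** n + 1):
        if any(is_separating(code, n, r)
               for code in combinations(range(2 ** n), size)):
            return size


C31 = {0b000, 0b001, 0b100}
print(is_separating(C31, 3, 1), uncovered(C31, 3, 1), min_separating_size(3, 1))
```

The only vector left uncovered is 111 (the integer 7), which is why $C_{3,1}$ is separating without being identifying.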

Remark 4. The definition of $C'$ shows that $|C'|$ will have a factor
$2^p$, so it seems best, in general, to take a code $C$ as small as
possible. However, it may be that a larger $C$, together with a (smaller)
$X_p$ inducing a smaller $Y_p$ or $Y_{p,k}$, gives better results. In
practice, since one cannot try everything, we were led to use the best
identifying codes at our disposal.

Open problem. Among all $(r, n)$ codes $C$ with $|C| = M_r(n)$, is there
at least one such that the set $X_r$ defined in Theorem 5 is empty? If the
answer is YES, then $M_r(n + r) \le 2^r M_r(n)$; in particular, we would
have $M_1(n + 1) \le 2 M_1(n)$. Could this be true for $X_p$ for any
$p \in \{1, \ldots, r\}$, so that we would have
$M_r(n + p) \le 2^p M_r(n)$?

It is possible to generalize the previous construction, changing both
length (from $n$ to $n + p$) and radius (from $r_1$ to $r_1 + r_2$), the
case $r_2 = 0$ being exactly Theorem 5.

Theorem 6 Let $r_1 \ge p \ge r_2 \ge 1$, and $k \in \{0, 1, \ldots, p-1\}$;
let $C$ be an $(r_1, n)$ code and

$$X_{p,r_2} = \{x \in F^n : \forall c \in C,\ d(x, c) \le r_1 - p + r_2 \text{ or } d(x, c) > r_1 + r_2\}.$$

Construction C1: Let $Y_{p,r_2} \subseteq F^n$ be a (minimum) set such that
for every $x \in X_{p,r_2}$ there exists $y \in Y_{p,r_2}$ with
$r_1 - p + r_2 + 1 \le d(x, y) \le r_1 + r_2$. Then

$$C' = (C \oplus F^p) \cup (Y_{p,r_2} \oplus (F^p \setminus \{0^p\}))$$

is $(r_1 + r_2, n + p)$.

Construction C2: Let $Y_{p,r_2,k} \subseteq F^n$ be a (minimum) set such
that for every $x \in X_{p,r_2}$ there exists $y \in Y_{p,r_2,k}$ with
$d(x, y) = r_1 + r_2 - k$, and let $C_{p,k}$ be a (minimum) $k$-separating
code in $F^p$. Then

$$C' = (C \oplus F^p) \cup (Y_{p,r_2,k} \oplus C_{p,k})$$

is $(r_1 + r_2, n + p)$.

Proof. First, we prove, in both constructions, C1 and C2, that any
$x \in F^{n+p}$ is $(r_1 + r_2)$-covered by a codeword in $C'$. We write
$x = x_1|x_2$ with $x_1 \in F^n$, $x_2 \in F^p$. Because $C$ is
$r_1$-identifying in $F^n$, there is a codeword $c \in C$ such that
$d(c, x_1) \le r_1$. Therefore,
$d(c|x_2, x_1|x_2) \le r_1 \le r_1 + r_2$, with
$c|x_2 \in C \oplus F^p \subseteq C'$.

Next, we prove that, given any two vectors $x, y \in F^{n+p}$
($x \ne y$), there is a codeword in $C'$ which $(r_1 + r_2)$-separates
them. We write $x = x_1|x_2$, $y = y_1|y_2$, with $x_1, y_1 \in F^n$,
$x_2, y_2 \in F^p$. We distinguish between four cases. The first three
cases, (i)-(iii), work for both constructions C1 and C2, because only
$C \oplus F^p$ is needed.

(i) $x_1 \ne y_1$, $x_2 \ne y_2$. Then there is a codeword $c \in C$ such
that, say, $d(c, x_1) \le r_1$ and $d(c, y_1) > r_1$. If $r_2 \le p - 1$,
then two spheres with radius $r_2$ and distinct centres are different in
$F^p$, and one is not included in the other. So there is a vector
$v \in F^p$ which is within distance $r_2$ from $x_2$ and not from $y_2$.
If $r_2 = p$, we take $v = y_2 + 1^p$, so that $d(v, y_2) = r_2$ and
$d(v, x_2) \le r_2$.

In both cases, $d(c|v, x_1|x_2) \le r_1 + r_2$ and
$d(c|v, y_1|y_2) > r_1 + r_2$, with $c|v \in C \oplus F^p \subseteq C'$.

(ii) $x_1 \ne y_1$, $x_2 = y_2$. Apply the argument in (i) with
$v = x_2 + 1^{r_2}|0^{p-r_2}$.

(iii) $x_2 \ne y_2$ and $x_1 = y_1 \notin X_{p,r_2}$. Then there is a
codeword $c \in C$ such that
$r_1 - p + r_2 + 1 \le d(c, x_1) \le r_1 + r_2$. If we set
$\lambda = r_1 + r_2 - d(c, x_1)$, we see that $0 \le \lambda \le p - 1$.
Therefore, as in case (i), we can find a vector $v \in F^p$ which is within
distance $\lambda$ from $x_2$ and not from $y_2$. Now
$d(c|v, x_1|x_2) \le d(c, x_1) + \lambda = r_1 + r_2$ and
$d(c|v, x_1|y_2) > d(c, x_1) + \lambda = r_1 + r_2$, with
$c|v \in C \oplus F^p \subseteq C'$.

(iv) $x_2 \ne y_2$ and $x_1 = y_1 \in X_{p,r_2}$.

In construction C1, there is a vector $z \in Y_{p,r_2}$ such that
$r_1 - p + r_2 + 1 \le d(z, x_1) \le r_1 + r_2$. Then if we set
$\lambda = r_1 + r_2 - d(z, x_1)$, we see that $0 \le \lambda \le p - 1$,
and by Lemma 4 there is a vector $v \in F^p \setminus \{0^p\}$ which is
within distance $\lambda$ from $x_2$ and not from $y_2$, or the other way
round. Then $d(z|v, x_1|x_2) \le d(z, x_1) + \lambda = r_1 + r_2$ and
$d(z|v, x_1|y_2) > d(z, x_1) + \lambda = r_1 + r_2$, or the other way
round, with $z|v \in Y_{p,r_2} \oplus (F^p \setminus \{0^p\}) \subseteq C'$,
and we have proved that $x$ and $y$ are $(r_1 + r_2)$-separated by $C'$.

In construction C2, there is a vector $z \in Y_{p,r_2,k}$ such that
$d(z, x_1) = r_1 + r_2 - k$ and a codeword $c \in C_{p,k}$ such that, say,
$d(c, x_2) \le k$ and $d(c, y_2) > k$. Then
$d(z|c, x_1|x_2) \le r_1 + r_2$ and $d(z|c, x_1|y_2) > r_1 + r_2$, with
$z|c \in Y_{p,r_2,k} \oplus C_{p,k} \subseteq C'$. □

3.2 Heuristics: noising and greedy

We have mentioned at the end of Section 2 a result on complexity which
suggests that constructing good identifying or discriminating codes in the 
Hamming space might be hard. 

Here, we use two different heuristic methods in order to build good 
identifying codes from scratch, noising and greedy. 

Noising algorithms have already been used in [6] for the construction of 
identifying codes in various grids; they constitute a family of metaheuristics, 
of which one is a generalization of simulated annealing [5]. Another of these
consists of the following. Once r, n and a number of codewords, c, have been 
fixed, we consider codes C C F n with c codewords, and we define NC(C) as 
the number of vectors which are not r-covered by C, NS(C) as the number 
of pairs of vectors not r-separated by C, and the evaluation function 

f(C) = NC(C) + NS(C), 

which we try to make equal to zero. An initial random code is chosen, which 
will be the current code C. We iteratively modify the current code, using 
an elementary transformation which consists in replacing a codeword by a 
noncodeword, thus keeping \C\ = c. 

Now when do we accept an elementary transformation? We cyclically go
through all codewords: after looking into the last codeword, we start again
with the first one. Looking into a codeword $m$ means that we go through
all vectors $s$ in $F^n \setminus C$, we note
$C_{m,s} = (C \setminus \{m\}) \cup \{s\}$, and we compute

$$\Delta(C, m, s) = f(C_{m,s}) - f(C).$$

For each $s$, we also compute a noised value

$$\Delta_{\mathrm{noise}}(C, m, s) = \Delta(C, m, s) + (\rho \times \ln(R)),$$

where $\rho$ is a tuning parameter which we make decrease, and $R$ is a
number which is randomly chosen for each new elementary transformation (see
below for more details).

If there is a vector $s$ for which $\Delta(C, m, s) < 0$, then we keep a
vector $s_0$ which minimizes $\Delta(C, m, s)$.

If for all vectors $s$, we have $\Delta(C, m, s) \ge 0$, then we look for a
vector $s_0$ which minimizes $\Delta_{\mathrm{noise}}(C, m, s)$, and we
keep $s_0$ only if $\Delta_{\mathrm{noise}}(C, m, s_0) < 0$.

If a vector $s_0$ has been found in one of the two cases above, then we
apply the elementary transformation with $C$, $m$ and $s_0$, so that $C$
becomes $(C \setminus \{m\}) \cup \{s_0\}$. Otherwise, the current code is
not modified after looking into $m$. After each accepted elementary
transformation, we check the evaluation function of the current code: if
$f(C) = 0$, then $C$ is $r$-identifying.

If we have found an identifying code, we reinitialize the process by re- 
moving from the current code C a codeword m which minimizes f(C\{m}), 
and we cyclically go through the remaining codewords. 

The parameter $R$ is a real number, randomly chosen, in a uniform way,
between zero and one; the noising rate $\rho$ is a positive real number
which we decrease arithmetically from an initial value down to zero, and
for each value of $\rho$, we cyclically go through the codewords a certain
number of times.
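The acceptance rule described above can be sketched as follows. This is only a minimal illustration, not the program used to obtain the results of Section 4: the function names, the default parameter values and the schedule of `rho` are our own assumptions, the re-initialization step after a success is omitted, and the evaluation function is recomputed from scratch at each trial, which is only reasonable for very small $n$.

```python
import math
import random
from itertools import combinations


def hamming(x, y):
    return bin(x ^ y).count("1")


def f(code, n, r):
    """Evaluation function f(C) = NC(C) + NS(C)."""
    sets = [frozenset(c for c in code if hamming(v, c) <= r)
            for v in range(2 ** n)]
    nc = sum(1 for s in sets if not s)                       # uncovered vectors
    ns = sum(1 for a, b in combinations(sets, 2) if a == b)  # unseparated pairs
    return nc + ns


def look_into(code, m, n, r, rho, rng):
    """Look into codeword m: possibly replace it by a non-codeword s0."""
    base, rest = f(code, n, r), code - {m}
    deltas = {s: f(rest | {s}, n, r) - base
              for s in range(2 ** n) if s not in code}
    if not deltas:
        return code
    s0 = min(deltas, key=deltas.get)
    if deltas[s0] >= 0:
        # No improving move: compare noised values; R = 1 - random() is in (0, 1].
        noised = {s: d + rho * math.log(1.0 - rng.random())
                  for s, d in deltas.items()}
        s0 = min(noised, key=noised.get)
        if noised[s0] >= 0:
            return code
    return rest | {s0}


def noising_search(n, r, size, rho0=2.0, levels=10, sweeps=3, seed=0):
    """Search for an (r, n) code with `size` codewords; may fail."""
    rng = random.Random(seed)
    code = set(rng.sample(range(2 ** n), size))
    for level in range(levels + 1):
        rho = rho0 * (levels - level) / levels   # arithmetic decrease to zero
        for _ in range(sweeps):
            for m in sorted(code):
                if m in code:
                    code = look_into(code, m, n, r, rho, rng)
                    if f(code, n, r) == 0:
                        return code
    return code


result = noising_search(n=4, r=1, size=8)
print(len(result), f(result, 4, 1))
```

Since $\ln(R) \le 0$, the noise can only make a non-improving move look better, which is what allows the search to leave local minima while $\rho > 0$.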

Greedy algorithms are based on the following simple idea: starting from an
empty code $C$, at each step we choose to add in $C$ a codeword $m$ which
will maximize $f(C) - f(C \cup \{m\})$. In case of a tie, the choice is
made at random.
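A direct transcription of this greedy idea might look as follows. It is only a sketch under our own assumptions (integer encoding of vectors, hypothetical function names, evaluation function recomputed from scratch), suitable for very small $n$.

```python
import random
from itertools import combinations


def hamming(x, y):
    return bin(x ^ y).count("1")


def f(code, n, r):
    """Evaluation function f(C) = NC(C) + NS(C)."""
    sets = [frozenset(c for c in code if hamming(v, c) <= r)
            for v in range(2 ** n)]
    nc = sum(1 for s in sets if not s)
    ns = sum(1 for a, b in combinations(sets, 2) if a == b)
    return nc + ns


def greedy_identifying_code(n, r, seed=0):
    """Add, one at a time, a codeword maximizing f(C) - f(C + {m})."""
    if r >= n:
        raise ValueError("r-identifying codes exist in F^n only if r < n")
    rng = random.Random(seed)
    code, current = set(), f(set(), n, r)
    while current > 0:
        gains = {m: current - f(code | {m}, n, r)
                 for m in range(2 ** n) if m not in code}
        best = max(gains.values())
        # Ties are broken at random, as in the text.
        m = rng.choice([v for v, g in gains.items() if g == best])
        code.add(m)
        current -= best
    return code


code = greedy_identifying_code(4, 1)
print(sorted(code), len(code))
```

The loop terminates because adding a codeword never increases $f$ and the whole space $F^n$ is $r$-identifying when $r < n$.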

4 Results 

We give tables of lower and upper bounds on $M_r(n)$ for $1 \le r \le 5$,
$r < n \le 21$. There are boldface figures when the exact value is known.
Up to now, the most extensive tables ($r = 1$, $n \le 19$, and $r = 2$,
$n \le 21$) had been given in [8].

4.1 Using heuristics 

The upper bounds which are marked by a star in our Tables were obtained
by noising methods, whereas a double star indicates a result obtained by a
greedy algorithm. For instance, the code consisting of the length-9 binary
expressions of the following 114 integers

u, i, _L, s o, 1 4, 1 7, 90, 93, 9Q, 31, 39, 37, 3Q, *± o, 4Q, oo, ou,
70, 79, 73, 7"i, 7Q, 89, 84, 1 m, 1 1 8, 1 90, 1 91, 1 99, 1 9fi, 1 9Q,
1 3Q, 1 40, 1 49, 148,
154, 157, 172, 177, 182, 183, 186, 188, 194, 209, 215, 216, 219,
222, 226, 227, 228, 233, 239, 240, 247, 263, 264, 267, 268, 274,
276, 295, 297, 300, 306, 314, 317, 319, 323, 325, 339, 344, 348,
350, 352, 358, 364, 367, 368, 369, 374, 383, 391, 393, 395, 404,
405, 406, 409, 414, 416, 418, 420, 425, 435, 440, 448, 452, 453,
458, 461, 467, 475, 485, 489, 490, 494, 495, 499, 508, 509, 510

is a (1,9)114 code obtained by noising. All our best codes can be found, in 
the same form, at 

http://www.infres.enst.fr/~charon/identifyingNcube.html 
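Turning such integers back into vectors is a one-line conversion. The sketch below is only an illustration: the function name is hypothetical, and reading the binary expression with the most significant bit as first coordinate is our assumption about the convention.

```python
def int_to_vector(value, n):
    """Length-n binary expression of value, most significant bit first."""
    if not 0 <= value < 2 ** n:
        raise ValueError("value does not fit in n bits")
    return tuple(int(bit) for bit in format(value, "0{}b".format(n)))


# Three of the integers listed above, as vectors of F^9.
for value in (314, 448, 510):
    print(value, int_to_vector(value, 9))
```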

4.2 Applying Theorem 5

As more or less direct consequences of the results obtained by noising and
greedy methods, we also obtain the following results; note that the various
sets $Y_i$, $Y_{i,j}$ below are obtained via a greedy-type algorithm.

(1) Using [8, Cor. 3] ([3, Th. 2] for $r = 1$), mentioned in Remark 2:

$$M_1(21) \le 4 M_1(19) \le 262144. \qquad (6)$$

$$M_2(19) \le 8 M_2(16) \le 14864; \quad M_2(20) \le 16 M_2(16) \le 29728; \qquad (7)$$

$$M_2(21) \le 32 M_2(16) \le 59456. \qquad (8)$$

$$M_3(18) \le 16 M_3(14) \le 2896; \quad M_3(19) \le 32 M_3(14) \le 5792; \qquad (9)$$

$$M_3(20) \le 64 M_3(14) \le 11584; \quad M_3(21) \le 128 M_3(14) \le 23168. \qquad (10)$$

$$M_4(19) \le 32 M_4(14) \le 2432; \quad M_4(20) \le 64 M_4(14) \le 4864; \qquad (11)$$

$$M_4(21) \le 128 M_4(14) \le 9728. \qquad (12)$$

$$M_5(19) \le 64 M_5(13) \le 1792; \quad M_5(20) \le 128 M_5(13) \le 3584; \qquad (13)$$

$$M_5(21) \le 256 M_5(13) \le 7168. \qquad (14)$$

(2a) Because we have a $(1, 13)1322$ code with $X_1 = \emptyset$, we have

$$M_1(14) \le 2 \cdot 1322 = 2644. \qquad (15)$$

(2b) We have a $(1, 15)4848$ code with $|X_1| = 128$; unfortunately,
because of the distance distribution in $X_1$, it is impossible to obtain a
set $Y_1$ with fewer than 128 elements, and therefore, by construction C1:

$$M_1(16) \le 2 \cdot 4848 + 128 = 9824. \qquad (16)$$

(2c) Because the $(1, 19)65536$ code from [8] is such that every vector is
1-covered by at least two codewords, we have $X_1 = \emptyset$ and

$$M_1(20) \le 2 \cdot 65536 = 131072. \qquad (17)$$

(2d) We have a $(2, 16)1858$ code with $|X_1| = 441$ and we found a
corresponding set $Y_1$ with 151 elements; therefore, by construction C1:

$$M_2(17) \le 2 \cdot 1858 + 151 = 3867. \qquad (18)$$

The same $(2, 16)1858$ code has $|X_2| = 283$, with $|Y_2| = 105$,
consequently:

$$M_2(18) \le 4 \cdot 1858 + 105 \cdot 3 = 7747. \qquad (19)$$

(2e) We have a $(3, 14)181$ code with $|X_1| = 60$ and a set $Y_1$ with 13
elements; therefore,

$$M_3(15) \le 2 \cdot 181 + 13 = 375. \qquad (20)$$

This $(3, 14)181$ code has $|X_2| = 6$, a set $Y_2$ with 4 elements, and we
obtain:

$$M_3(16) \le 4 \cdot 181 + 4 \cdot 3 = 736. \qquad (21)$$

The same $(3, 14)181$ code has $X_3 = \emptyset$, and

$$M_3(17) \le 8 \cdot 181 = 1448. \qquad (22)$$

(2f) We have a $(4, 14)76$ code with $|X_1| = 26$, $|Y_1| = 4$, yielding

$$M_4(15) \le 2 \cdot 76 + 4 = 156. \qquad (23)$$

The same $(4, 14)76$ code has $X_2 = X_3 = X_4 = \{7577, 8802\}$; these two
numbers represent two length-14 vectors at distance 13 from one another,
so all the sets $Y_i$, $Y_{i,j}$ have size two for $i = 2, 3, 4$. In
particular, $|Y_2| = |Y_{3,1}| = |Y_{4,1}| = 2$; therefore, by
construction C1:

$$M_4(16) \le 4 \cdot 76 + 2 \cdot 3 = 310, \qquad (24)$$

and by construction C2:

$$M_4(17) \le 8 \cdot 76 + 2 \cdot 3 = 614, \quad M_4(18) \le 16 \cdot 76 + 2 \cdot 6 = 1228, \qquad (25)$$

because optimum 1-separating codes have size three in $F^3$ (see
Example 2) and have size six in $F^4$; this comes from Remark 1(ii) on
separating codes and the fact that $M_1(4) = 7$ and $M_2(4) = 6$ (see
Tables 1 and 2).
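The distance between the two vectors represented by 7577 and 8802 can be recomputed from their binary expressions. This is only a small check, with a hypothetical helper name.

```python
def hamming(x, y):
    """Hamming distance between two vectors encoded as integers."""
    return bin(x ^ y).count("1")


u, v = 7577, 8802
print(format(u, "014b"), format(v, "014b"), hamming(u, v))
```

The two length-14 binary expressions have disjoint supports of sizes 8 and 5, hence the distance 13.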
(2g) We have a $(5, 13)28$ code with $|X_1| = 43$, $|Y_1| = 4$, yielding

$$M_5(14) \le 2 \cdot 28 + 4 = 60. \qquad (26)$$

This $(5, 13)28$ code has $|X_2| = 1$, $|Y_2| = 1$, and therefore

$$M_5(15) \le 4 \cdot 28 + 1 \cdot 3 = 115. \qquad (27)$$

This same $(5, 13)28$ code has $X_3 = X_4 = X_5 = \emptyset$, and so:

$$M_5(16) \le 8 \cdot 28 = 224, \quad M_5(17) \le 16 \cdot 28 = 448, \quad M_5(18) \le 32 \cdot 28 = 896. \qquad (28)$$

All the nonempty sets $X_i$, $Y_i$, $Y_{i,j}$ mentioned above are given at
http://www.infres.enst.fr/~charon/identifyingNcube.html
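The arithmetic behind bounds (15)-(28) follows directly from the sizes of the two parts of $C'$ in Theorem 5. The sketch below simply recomputes them; the function names `c1_bound` and `c2_bound` are ours and purely illustrative.

```python
def c1_bound(K, p, y=0):
    """Size bound for construction C1: 2^p |C| + |Y_p| (2^p - 1)."""
    return 2 ** p * K + y * (2 ** p - 1)


def c2_bound(K, p, y, separating_size):
    """Size bound for construction C2: 2^p |C| + |Y_{p,k}| |C_{p,k}|."""
    return 2 ** p * K + y * separating_size


bounds = {
    15: c1_bound(1322, 1),      16: c1_bound(4848, 1, 128),
    17: c1_bound(65536, 1),     18: c1_bound(1858, 1, 151),
    19: c1_bound(1858, 2, 105), 20: c1_bound(181, 1, 13),
    21: c1_bound(181, 2, 4),    22: c1_bound(181, 3),
    23: c1_bound(76, 1, 4),     24: c1_bound(76, 2, 2),
    25: (c2_bound(76, 3, 2, 3), c2_bound(76, 4, 2, 6)),
    26: c1_bound(28, 1, 4),     27: c1_bound(28, 2, 1),
    28: tuple(c1_bound(28, p) for p in (3, 4, 5)),
}
for equation, value in bounds.items():
    print(equation, value)
```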

4.3 Further improvements: removing codewords 

Perhaps Theorems 5 and 6 can be sharpened, since in practice we observe
(with the help of a computer) that the sizes of several codes obtained by
Theorem 5 can be reduced by simply removing some of their codewords,
which are "useless". This can also be done with two of the codes obtained
by a greedy algorithm.

As a consequence, we have new upper bounds for some values of $n$ and $r$,
which are marked by a triple star in the Tables. The corresponding codes
can be found at

http://www.infres.enst.fr/~charon/identifyingNcube.html

We can observe that when $r$ increases, the reductions can be drastic:
almost 50% in the case $r = 5$, $n = 18$!



4.4 Re-applying Theorem 5

We can again use Theorem 5 with the newly improved codes obtained in
Section 4.3.

(3a) We have a $(2, 20)29346$ code with $X_1 = \emptyset$, and so

$$M_2(21) \le 2 \cdot 29346 = 58692. \qquad (29)$$

(3b) We have a $(3, 19)5532$ code with $X_1 = X_2 = \emptyset$, and so

$$M_3(20) \le 2 \cdot 5532 = 11064, \quad M_3(21) \le 4 \cdot 5532 = 22128. \qquad (30)$$

(3c) We have a $(4, 18)1045$ code with $|X_1| = 2$, $|Y_1| = 2$, yielding

$$M_4(19) \le 2 \cdot 1045 + 2 = 2092. \qquad (31)$$

This $(4, 18)1045$ code has $X_2 = X_3 = \emptyset$, and therefore

$$M_4(20) \le 4 \cdot 1045 = 4180, \quad M_4(21) \le 8 \cdot 1045 = 8360. \qquad (32)$$

(3d) We have a $(5, 18)454$ code with $|X_1| = 1$, $|Y_1| = 1$, yielding

$$M_5(19) \le 2 \cdot 454 + 1 = 909. \qquad (33)$$

This $(5, 18)454$ code has $|X_2| = 1$, $|Y_2| = 1$, and therefore

$$M_5(20) \le 4 \cdot 454 + 1 \cdot 3 = 1819. \qquad (34)$$

This same $(5, 18)454$ code has $X_3 = \emptyset$, and so:

$$M_5(21) \le 8 \cdot 454 = 3632. \qquad (35)$$


Due to time and space limitations, we could not try to remove codewords 
from these new codes. 

4.5 Tables 

We give our results for $1 \le r \le 5$, $r + 1 \le n \le 21$. For some
values of $r$ and $n$, we give two upper bounds, the first one from
Section 4.2, the second one from Sections 4.3 or 4.4, so that one can see
how we used Theorem 5, then possibly removed codewords and possibly reused
Theorem 5.

We think that there is still room for improvement, and we encourage
the reader to improve on these upper bounds.

Key to Tables 

Lower bounds Upper bounds 



a  [12, Th. 1(iii)]
b  [12, Th. 2]
c  [EH Th. 3]
d  [3, Th. 4]
e  Th. 11]
f  $M_{n-1}(n) = 2^n - 1$ [2, Th. 5]
g  [2, Th. 6]
h  Table 4]
i  [13, Cor. 4]
j  [13, Cor. 5]
k  [13, Cor. 7]
l  by (3) and $M_1(5) = 10$
m  pH Table 3.1]

 n    lower bound    1st upper bound    2nd upper bound
 2    a 3            3 B
 3    b 4            4 A
 4    d 7            7 C
 5    b 10           10 A
 6    c 18           19 D
 7    e 32           32 E
 8    c 56           62 G,F
 9    c 101          114*
10    c 183          211*
11    c 337          352 F
12    c 623          688*
13    c 1158         1322*
14    c 2164         2644 (15)
15    c 4063         4848*
16    c 7654         9824 (16)          9779***
17    c 14469        19043**            19026***
18    c 27434        36423**            36406***
19    c 52155        65536 F
20    c 99392        131072 (17)
21    c 189829       262144 (6)

Table 1: Lower and upper bounds, r = 1.


 n    lower bound    1st upper bound    2nd upper bound
 3    f 7            7 B
 4    g 6            6 H
 5    a 6            6 H
 6    a 8            8 H
 7    h 14           14 F
 8    h 20           21 F
 9    m 26           32*
10    i 41           60*
11    i 67           106**
12    i 112          185**
13    i 190          328**
14    i 326          580**
15    i 567          1032**
16    i 995          1858**
17    i 1761         3867 (18)          3785***
18    i 3141         7747 (19)          7609***
19    i 5638         14864 (7)          14673***
20    i 10179        29728 (7)          29346***
21    i 18471        59456 (8)          58692 (29)

Table 2: Lower and upper bounds, r = 2.

 n    lower bound    1st upper bound    2nd upper bound
 4    f 15           15 B
 5    l 9            10*
 6    a 7            <Y*
 7    a 8            8*
 8    a 10           13*
 9    a 13           17*
10    a 18           28*
11    a 25           37**
12    a 39           68**
13    a 61           112**
14    a 95           181**
15    a 151          375 (20)           356***
16    a 241          736 (21)           700***
17    a 383          1448 (22)          1387***
18    a 608          2896 (9)           2766***
19    a 959          5792 (9)           5532***
20    k 1593         11584 (10)         11064 (30)
21    j 2722         23168 (10)         22128 (30)

Table 3: Lower and upper bounds, r = 3.


 n    lower bound    1st upper bound    2nd upper bound
 5    f 31           31 B
 6    a 7            18*
 7    a 8            14*
 8    a 9            13*
 9    a 10           14*
10    a 12           16*
11    a 15           20*
12    a 19           34**
13    a 27           48**
14    a 38           76**
15    a 54           156 (23)           142***
16    a 77           310 (24)           272***
17    a 121          614 (25)           530***
18    a 190          1228 (25)          1045***
19    a 304          2432 (11)          2092 (31)
20    a 489          4864 (11)          4180 (32)
21    a 792          9728 (12)          8360 (32)

Table 4: Lower and upper bounds, r = 4.

 n    lower bound    1st upper bound    2nd upper bound
 6    f 63           63 B
 7    a 8            35*
 8    a 9            22*
 9    a 10           17*
10    a 11           19*
11    a 12           19**
12    a 14           25**
13    a 17           28**
14    a 21           60 (26)
15    a 28           115 (27)
16    a 37           224 (28)           127***
17    a 53           448 (28)           232***
18    a 77           896 (28)           454***
19    a 112          1792 (13)          909 (33)
20    a 161          3584 (13)          1819 (34)
21    a 229          7168 (14)          3632 (35)

Table 5: Lower and upper bounds, r = 5.



4.6 Conclusion 

By mixing heuristic and theoretical construction arguments, we were
able to present numerous upper bounds on $M_r(n)$, the smallest possible
cardinality of an $r$-identifying code in $F^n$: we first used heuristics
to construct codes from scratch, we then used some of these codes to build
new codes with the help of Theorem 5; after that, the computer possibly
removed codewords from these codes, and eventually we reapplied Theorem 5.

There still remains a large, challenging gap between the lower and upper 
bounds for most of the values of r, n in Tables 1-5. 

Of course, all these bounds are transposable to discriminating codes, by
Corollary 3.